Thursday, October 26, 2006

Schrodinger equation in the mass media

The Schrodinger equation appears in the background in the second half of Weird Al's video White & Nerdy.


The video also shows rolling dice (briefly), if one wishes to stretch its relevance to include Monte Carlo as well.

Friday, October 20, 2006

The Other QMC

Quasi-Monte Carlo is the other method commonly referred to by the QMC acronym (I will abbreviate it QuasiMC to minimize confusion). It's not really Monte Carlo since the sequence of points is not random, but the point sets do have a Monte Carlo-ish feel about them.

And in fact, it does share the property of arbitrary termination with Monte Carlo integration (consider a grid by contrast - you have to use all the points at a particular spacing. Stopping halfway across the region of integration is not going to yield good results).

The reason people are interested in QuasiMC is that its convergence is better than the 1/sqrt(N) of Monte Carlo. It does this by using point sets that are more evenly distributed than random points - there are fewer of the clumps and large gaps that appear in random sequences. (Search for 'low-discrepancy sequences'. Here are the Wikipedia entries for a technical discussion and some sequences.)
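To make "low-discrepancy" a bit more concrete, here is a sketch of the base-2 van der Corput sequence, the one-dimensional building block of the Halton sequence (the snippet and names are mine, just for illustration):

    def van_der_corput(n, base=2):
        # n-th point: write n in the given base, reverse the digits,
        # and place them after the radix point.
        x, denom = 0.0, 1.0
        while n > 0:
            n, digit = divmod(n, base)
            denom *= base
            x += digit / denom
        return x

    points = [van_der_corput(i) for i in range(1, 9)]
    # [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875, 0.0625]
    # Each new point falls into one of the largest remaining gaps - no clumps, no holes.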

QuasiMC point sets are usually generated from sequences where the low-discrepancy condition can be verified theoretically, but it is possible to use the intuitive approach of making the points maximally spread out (see this presentation for an example).

The following algorithm will generate a set of N points that gives better convergence than random (at least it worked on a simple integrand in one dimension.)

  1. compute some random points (>> N points)

  2. pick a starting point and put it in the set S.

  3. find the point that is the furthest from all existing points in S, and place that point in S.

  4. repeat step 3 until N points are selected.



(I'm not saying this is an efficient way to generate the points, just that it can be done.)
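Here is a rough Python sketch of those steps. I'm reading "furthest from all existing points in S" as maximizing the minimum distance to the points already chosen (the usual farthest-point idea); the names and numbers are just for illustration.

    import numpy as np

    def farthest_point_set(candidates, n_select, seed=0):
        # Step 1 happens outside this function: 'candidates' is a large pool of random points.
        rng = np.random.default_rng(seed)
        selected = [candidates[rng.integers(len(candidates))]]      # step 2
        while len(selected) < n_select:                             # step 4
            # distance from each candidate to its nearest already-selected point
            diffs = candidates[:, None, :] - np.asarray(selected)[None, :, :]
            d = np.linalg.norm(diffs, axis=-1).min(axis=1)
            selected.append(candidates[np.argmax(d)])               # step 3
        return np.array(selected)

    pool = np.random.default_rng(1).random((2000, 2))   # step 1: many more than N random points
    points = farthest_point_set(pool, 16)               # N = 16 well-spread points in the unit square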

For additional information, see chapter 7.7 of Numerical Recipes.

Friday, September 01, 2006

QMC wiki

Check out the QMC wiki at http://www.qmcwiki.org

It looks like a promising resource for the QMC research community.

Tuesday, June 27, 2006

Optimal histogram bin width

Kevin Knuth wrote a paper about finding the optimal number of bins to represent data in a histogram (Optimal Data-Based Binning for Histograms). He starts from a piecewise constant density model and finds the (Bayesian) posterior probability from this model (equation 36, which is actually the log of the posterior). The posterior function is then maximized to find the number of bins that best models the data.
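Here is a rough sketch of that maximization as I understand it. The log-posterior expression below is my transcription and should be checked against equation 36 in the paper before trusting it; the search over the number of bins is just brute force.

    import numpy as np
    from scipy.special import gammaln

    def log_posterior(data, m):
        # Log posterior (up to an additive constant) for m equal-width bins,
        # under the piecewise-constant density model.
        n = len(data)
        counts, _ = np.histogram(data, bins=m)
        return (n * np.log(m)
                + gammaln(0.5 * m)
                - m * gammaln(0.5)
                - gammaln(n + 0.5 * m)
                + np.sum(gammaln(counts + 0.5)))

    def optimal_bin_count(data, max_bins=200):
        ms = np.arange(1, max_bins + 1)
        return ms[np.argmax([log_posterior(data, m) for m in ms])]

    data = np.random.default_rng(0).normal(size=1000)
    print(optimal_bin_count(data))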


The article also investigates the number of data points for a reliable estimation of the density. The recommendation is 100-150 points, if the distribution is Gaussian.


It would be interesting to apply this method to radial distribution functions. However, the assumption of a constant volume for each bin is not met in that case. There are several ways this could be adjusted (scale each bin count by its volume, or use non-uniform bin spacing to maintain constant volume), but I'm not sure they are valid.
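To be concrete about what I mean by those two adjustments (just a sketch with placeholder distances - I'm not claiming either one makes the piecewise-constant model apply to an RDF):

    import numpy as np

    def shell_volumes(edges):
        # volume of each spherical shell between consecutive bin edges
        return 4.0 / 3.0 * np.pi * (edges[1:]**3 - edges[:-1]**3)

    distances = np.random.default_rng(0).random(10_000) * 5.0   # placeholder pair distances

    # Adjustment 1: uniform bins, then scale each count by its shell volume
    counts, edges = np.histogram(distances, bins=50, range=(0.0, 5.0))
    density = counts / shell_volumes(edges)

    # Adjustment 2: choose edges so every shell has the same volume
    n_bins, r_max = 50, 5.0
    equal_volume_edges = r_max * (np.arange(n_bins + 1) / n_bins) ** (1.0 / 3.0)
    counts_ev, _ = np.histogram(distances, bins=equal_volume_edges)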


Alternatively, the discussion references other algorithms for dealing with variable-bin-width models (which may be better for resolving multiple peaks anyway).

Thursday, June 08, 2006

QMC derivation notes

I posted a document I wrote in grad school, Notes on the wavefunction and local energy. It contains derivations of various QMC formulas, particularly the first and second derivatives for several forms of wavefunctions.


I'm posting this for two reasons. The first is in case anyone finds the formulas useful when working on a QMC code.


The second is related to the process of scientific programming. When writing a QMC code, I found it useful to record the formulas and derivations in a neatly typset form. Then the next step involved turning the equations into computer code. (Then, of course, testing and debugging).
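As a toy illustration of that "equations into code" step (this example is mine, not taken from the notes): for the hydrogen atom with trial wavefunction psi = exp(-alpha*r), the local energy E_L = (H psi)/psi works out to -alpha^2/2 + (alpha - 1)/r in atomic units, and the formula translates almost line-for-line into code.

    import numpy as np

    def local_energy_hydrogen(r, alpha):
        # E_L = -0.5 * (laplacian psi)/psi - 1/r, with
        # (laplacian psi)/psi = alpha**2 - 2*alpha/r for psi = exp(-alpha*r)
        return -0.5 * alpha**2 + (alpha - 1.0) / r

    # Sanity check: at alpha = 1 the trial function is exact, so E_L = -0.5 everywhere
    r = np.linspace(0.1, 5.0, 50)
    assert np.allclose(local_energy_hydrogen(r, 1.0), -0.5)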


This workflow is what I would like to capture with the Programming in Mathematical Notation work. The document with derivations could be written in content MathML (or something more amenable to human manipulation). Ideally the computer could then assist with verifying the derivations for correctness, and with converting the equations into computer code.


And as long as I'm dreaming, I'd really like a wiki-like interface for creating and editing such a document (making a set of hyperlinked pages rather than a single linear document)

Wednesday, April 05, 2006

Interesting article studying writing journal articles

Last fall I looked at methods for improving the efficiency of MC simulations, and was pointed to the article, Efficient Monte Carlo methods for the computer simulation of biological molecules by Djamal Bouzida, Shankar Kumar, and Robert Swendsen.


A search of the authors' names on Yahoo turned up a fascinating article by Ann Blakeslee, who studied the process of writing this paper. The article examines the interaction between the graduate student (Bouzida) and the advisor (Swendsen), covers the process of learning to write scientific papers, and looks at the difficulties Bouzida faced as a novice.

Wednesday, March 22, 2006

Search for chemicals

Check out chmoogle, the search engine for chemicals. Many of the results are for suppliers (not so useful for theoreticians), but some of the results contain more information about the chemical.


The "details" link by the chemical structure picture also has alternate names, along with links to search Google or Yahoo.

Monday, March 20, 2006

[APS] First Principles Molecular Dynamics on Blue Gene

Francois Gygi gave a talk about scaling a first principles molecular dynamics code to 65,536 processors on the Blue Gene/L at LLNL.
The core numerical algorithms that need to scale are the FFT and dense linear algebra. The FFT is limited in scalability to 512 processors per k-point. Fortunately, there are many k-points and they can be computed independently.
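One common way to express that kind of decomposition with MPI (a generic sketch, not what their code actually does) is to split the global communicator into one group per k-point, so each group runs its own FFTs:

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    n_kpoints = 8                                 # pretend problem size
    procs_per_kpoint = max(size // n_kpoints, 1)  # the talk capped this around 512
    my_kpoint = min(rank // procs_per_kpoint, n_kpoints - 1)  # assumes size divides evenly

    # ranks assigned to the same k-point share a communicator;
    # the FFTs and dense linear algebra for that k-point run inside it
    kpoint_comm = comm.Split(my_kpoint, rank)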


Modifying the assignment of tasks to processors increased the performance by 64%! FLOPS are free; it's the communication patterns that determine performance.


One research problem is improving the scalability of computations for small systems (e.g., 32 water molecules), so they can be simulated for longer times.

[APS] Solar Power

David Carlson (from BP Solar) gave a talk titled, The Status and Outlook for the Photovoltaics Industry. His presentation slides are also available online.


Some random facts I found interesting


  • The photovoltaic industry has been growing at 35% per year.

  • This year, the PV industry will use as much or more silicon than the semiconductor industry. This may cause problems with the growth curve, as suppliers ramp up to supply the PV industry.

  • The current cost per kWh is $0.18. Less than half of the cost is due to the PV module. (Slide 30)

  • Given the cost of PV electricity, the cost of conventional electricity, and the availability of sun, Spain is almost to the point of cost-competitive PV electricity. (Slide 31)


Monday, March 13, 2006

[APS] QC for QC

Interesting point from Alan's talk about using Quantum Computing to solve Quantum Chemistry problems: computing cholesterol is about as hard as breaking encryption algorithms.


The quantum computing approach takes quite a bit of setup - an actual Hartree-Fock calculation or more. The advantage of QC is that the exact answer (Full CI) can be obtained easily[1] by the QC from that starting point. (But going from HF to FCI is the hard part classically - I think Full CI scales exponentially in the system size)


[1] as much as anything with quantum computers can be called _easy_

[APS] Hydrogen storage

I went to a few talks in a session about Simulations of hydrogen storage. It appears that compressed gaseous hydrogen has about one-tenth the energy density of gasoline. (Some quick googling found this explanation. It really matters whether you're comparing by volume or by weight - by weight, hydrogen has a much higher energy density.)


John Tse talked about hydrate clathrates (there's lots of methane at the bottom of the ocean, and it may be a way to store hydrogen).


One point I found interesting - the potential between a hydrogen molecule and a graphene sheet fits a Lennard-Jones potential quite well (for one particular site or orientation).
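For reference, the standard 12-6 Lennard-Jones form (the epsilon and sigma below are placeholders, not the fitted H2-graphene parameters):

    def lennard_jones(r, epsilon, sigma):
        # standard 12-6 form: V(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)
        return 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)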


Technical note - doing simulations with DFT is not quite sufficient to get good energies; they used MP2-level theory. (It seems this was repeated in another talk.) Question - are MP2 energies good enough, or is doing better just too expensive?


William Goddard presented theoretical evaluations of several approaches to hydrogen storage.


  • Metallo-Organic Frameworks
  • Mg nanoparticles
  • Metal hydrides
  • Carbon materials (nanotubes, some doped with Li)

In the Mg nanoparticle section, he talked about solving the inversion problem - designing the material to specifications. Their approach was to find the potential (I think?) of a desirable atom, and then find real atoms to match.
(There was a similar idea presented by Alex Zunger, trying to find materials that matched specific band-related properties.)

Wednesday, March 08, 2006

Monte Carlo methods in CISE

Monte Carlo methods are the theme for the March issue of Computing in Science and Engineering. There's a broad overview article, an article covering the geometric cluster method (contains descriptions of the lattice-based cluster algorithms as well), an article examining Markov chains arising from computer science, and an article on sequential importance sampling.

Saturday, March 04, 2006

March meeting

I'll be at the APS March Meeting.


I'm giving a talk in session V27 (on Thursday).


If anyone else going to the meeting wants to meet and talk about QMC, MC, computational physics, or other topics, feel free to email me.

Wednesday, January 18, 2006

Unbiased exponential estimators

The penalty method allows a Metropolis random walk to remain correct even in the presence of noise in the probability function being sampled.


It eventually dawned on me that this same method can be used to correct any exponential estimator.


Imagine we wish to compute the estimator exp(x), where x is some sample value that is contaminated with noise. Suppose the noise is Gaussian with standard deviation sigma. Then the estimator exp(x - sigma^2/2) will be unbiased.
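A quick numerical check of the claim (the numbers are made up, just to illustrate the correction):

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 0.3, 0.5                          # true value of x and noise level
    x = mu + sigma * rng.normal(size=1_000_000)   # noisy samples of x

    naive = np.exp(x).mean()                      # biased: approaches exp(mu + sigma**2/2)
    corrected = np.exp(x - sigma**2 / 2).mean()   # approaches exp(mu)

    print(np.exp(mu), naive, corrected)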

Sunday, January 15, 2006

Free Energy Links

Radford Neal has written a paper on computing free energy differences. The paper presents the Linked Importance Sampling method, but describes other methods as well. The paper contains references to both the physics literature and the statistics literature, and uses physics language.


If you're really interested in free energy differences, check out Arjun Acharya's thesis, Free Energy differences: Representations, estimators, and sampling strategies (also on arxiv). The review chapter covers several methods, but most of the thesis deals with the Phase Mapping technique, which increases the overlap between phases. It does not eliminate the problem entirely, however, and many of the other methods can be used to deal with the remaining issues.


[Edit 2/27/06: updated links from comments, and revised the entry]